LN2R a knowledge based reference reconciliation system: OAEI 2010 results
Abstract
This paper presents the first participation of the LN2R system in IM@OAEI2010, the Instance Matching track of the Ontology Alignment Evaluation Initiative 2010 campaign. In particular, we participated in the OWL data track by running LN2R on the Person-Restaurant data set. We obtained very good results on the person data sets and reasonable results on the restaurant data set.

1 Presentation of the system

To design a semantic information integration system, we face two reconciliation problems. The first is schema (or ontology) reconciliation, which consists in finding mappings between elements (concepts or relations) of two schemas or two ontologies (see [1, 2] for surveys). The second is data reconciliation (also called reference reconciliation), which consists in comparing data descriptions and deciding whether different descriptions refer to the same real-world entity (e.g. the same person, the same article, the same gene). Reference reconciliation is critical because it affects data quality and data consistency [3]. LN2R addresses only the reference reconciliation problem.

There are several kinds of reference reconciliation approaches: knowledge-based, similarity-based, probabilistic, supervised, etc. [4]. In this paper we focus on reference reconciliation approaches that are informed and global. Informed approaches exploit knowledge declared in the ontology to reconcile data. Approaches are said to be global when they exploit the dependencies that may exist between reference reconciliations [5, 6]. Such approaches use the attribute values describing the data, but also the references related to the data under consideration [5, 6]. For example, reconciling two scientists can entail reconciling their two affiliated universities. Such dependencies result from the semantics of the domain of interest.
1.1 State, purpose, general statement

The reference reconciliation system (LN2R) that we tested in the IM@OAEI2010 campaign is knowledge-based and unsupervised. It combines two methods: a logical one called L2R and a numerical one called N2R. The Logical method for Reference Reconciliation (L2R) is based on the translation into first-order logic (Horn rules) of some of the schema semantics. To complement the partial results of L2R, we designed a Numerical method for Reference Reconciliation (N2R). It exploits the L2R results and computes a similarity score for each pair of references.

Reference reconciliation problem. Let $S_1$ and $S_2$ be two data sources that conform to the same OWL ontology. Let $I_1$ and $I_2$ be the two reference sets that correspond respectively to the data of $S_1$ and $S_2$. The problem consists in deciding whether references are reconciled or not reconciled. Let $Reconcile$ be a binary predicate: $Reconcile(X, Y)$ means that the two references denoted by $X$ and $Y$ refer to the same real-world entity.

The reference reconciliation problem considered in L2R consists in extracting from the set $I_1 \times I_2$ of reference pairs two subsets $REC$ and $NREC$ such that:

$$REC = \{(i, i') \mid Reconcile(i, i')\} \quad \text{and} \quad NREC = \{(i, i') \mid \neg Reconcile(i, i')\}$$

The reference reconciliation problem considered in N2R consists in, given a similarity function $Sim_r : I_1 \times I_2 \rightarrow [0..1]$ and a threshold $T_{rec}$ (a real value in $[0..1]$ given by an expert, fixed experimentally or learned on a labeled data sample), computing the following set:

$$REC_{N2R} = \{(i, i') \in (I_1 \times I_2) \setminus (REC \cup NREC) \mid Sim_r(i, i') > T_{rec}\}$$

1.2 Specific techniques used

In the following, we present some details of the knowledge-based reference reconciliation system (LN2R). First, we show through an example the ontology and the kind of knowledge that we use. Second, we give a brief presentation of the two reference reconciliation methods, L2R and N2R.
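The N2R problem statement of Section 1.1 can be read as a simple filter: among the pairs of I1 × I2 that L2R has left undecided, keep those whose similarity exceeds the threshold Trec. The following is a minimal sketch of that set definition only, not of the N2R similarity computation itself. The function names (`rec_n2r`, `jaccard`, `sim_r`), the toy references and the threshold value are hypothetical, and a token-level Jaccard measure stands in for the real Simr.

```python
from itertools import product
from typing import Callable, Iterable, Set, Tuple

Pair = Tuple[str, str]


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity in [0, 1]; only a stand-in for Simr."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def rec_n2r(
    i1: Iterable[str],
    i2: Iterable[str],
    rec: Set[Pair],
    nrec: Set[Pair],
    sim: Callable[[str, str], float],
    t_rec: float,
) -> Set[Pair]:
    """Pairs of I1 x I2 not already decided by L2R whose similarity exceeds t_rec."""
    decided = rec | nrec
    return {
        (a, b)
        for a, b in product(i1, i2)
        if (a, b) not in decided and sim(a, b) > t_rec
    }


# Hypothetical toy data: reference identifiers mapped to a name attribute.
names1 = {"p1": "Le Louvre Museum", "p2": "Orsay Museum"}
names2 = {"q1": "Louvre Museum", "q2": "Orsay Museum Paris"}

# Decisions assumed to have been produced by the logical step (L2R).
rec = {("p1", "q1")}
nrec = {("p1", "q2")}


def sim_r(a: str, b: str) -> float:
    """Similarity between two references, computed here from their names only."""
    return jaccard(names1[a], names2[b])


result = rec_n2r(names1, names2, rec, nrec, sim_r, t_rec=0.5)
print(result)  # {('p2', 'q2')}
```

The strict inequality mirrors the definition above: a pair whose score equals the threshold is not reconciled.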
The ontology and its constraints

The considered OWL ontology consists of a set of classes (unary relations) organized in a taxonomy and a set of typed properties (binary relations). These properties can also be organized in a taxonomy of properties. Two kinds of properties can be distinguished in OWL: the so-called relations (owl:ObjectProperty in OWL), whose domain and range are both classes, and the so-called attributes (owl:DatatypeProperty in OWL), whose domain is a class and whose range is a set of basic values (e.g. Integer, Date, Literal).

We allow the declaration of constraints expressed in OWL-DL or in SWRL in order to enrich the domain ontology. The constraints that we consider are of the following types:
– Disjointness constraints between classes: DISJOINT(C, D) declares that the two classes C and D are disjoint.
– Functionality constraints on properties: PF(P) declares that the property P (relation or attribute) is a functional property.
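To illustrate how the two constraint types listed above can drive reconciliation decisions, here is a minimal sketch under stated assumptions. It uses the usual logical reading of the constraints, which the text above does not spell out: DISJOINT(C, D) with C(x) and D(y) yields a non-reconciliation of x and y, and PF(P) with Reconcile(x, y), P(x, z) and P(y, w) yields Reconcile(z, w). The function names, the data layout and the `affiliatedTo` property are hypothetical. The sketch covers functional relations only (attributes with literal values are left out) and is not the L2R rule set itself.

```python
from itertools import product
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]
Facts = Dict[Tuple[str, str], str]  # (property, subject) -> object


def apply_disjoint(
    types1: Dict[str, str], types2: Dict[str, str], disjoint: Set[Pair]
) -> Set[Pair]:
    """DISJOINT(C, D): C(x) and D(y) => not Reconcile(x, y)."""
    nrec = set()
    for (x, cx), (y, cy) in product(types1.items(), types2.items()):
        if (cx, cy) in disjoint or (cy, cx) in disjoint:
            nrec.add((x, y))
    return nrec


def apply_functional(
    rec: Set[Pair], facts1: Facts, facts2: Facts, functional: Set[str]
) -> Set[Pair]:
    """PF(P): Reconcile(x, y), P(x, z), P(y, w) => Reconcile(z, w), to a fixpoint."""
    rec = set(rec)
    changed = True
    while changed:
        changed = False
        for p in functional:
            for x, y in list(rec):
                z, w = facts1.get((p, x)), facts2.get((p, y))
                if z is not None and w is not None and (z, w) not in rec:
                    rec.add((z, w))
                    changed = True
    return rec


# Hypothetical toy data: two scientists and their universities in each source.
types1 = {"s1": "Person", "u1": "University"}
types2 = {"t1": "Person", "v1": "University"}
facts1 = {("affiliatedTo", "s1"): "u1"}
facts2 = {("affiliatedTo", "t1"): "v1"}

nrec = apply_disjoint(types1, types2, {("Person", "University")})
rec = apply_functional({("s1", "t1")}, facts1, facts2, {"affiliatedTo"})

print(sorted(nrec))  # [('s1', 'v1'), ('u1', 't1')]
print(sorted(rec))   # [('s1', 't1'), ('u1', 'v1')]
```

The second result reproduces the scientist example given earlier: once the two scientists are reconciled, a functional affiliation relation entails the reconciliation of their universities.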
Similar resources
KD2R: A Key Discovery Method for Semantic Reference Reconciliation
The reference reconciliation problem consists of deciding whether different identifiers refer to the same world entity. Some existing reference reconciliation approaches use key constraints to infer reconciliation decisions. In the context of the Linked Open Data, this knowledge is not available. We propose KD2R, a method which allows automatic discovery of key constraints associated to OWL2 cl...
Re-using Cool URIs: Entity Reconciliation Against LOD Hubs
We observe that “LOD hubs” are emerging. They provide well-managed reference identifiers that attract a large share of the incoming links on the Web of Data and play a crucial role in data integration within communities of interest. But connecting to such hubs as part of the Linked Data publishing process is still a difficult task. In this paper, we explore several approaches to the implementat...
AML results for OAEI 2015
AgreementMakerLight (AML) is an automated ontology matching system based primarily on element-level matching and on the use of external resources as background knowledge. This paper describes its configuration for the OAEI 2015 competition and discusses its results. For this OAEI edition, we focused mainly on the Interactive Matching track due to its expansion, as handling user interactions on ...
Ontology-Driven Possibilistic Reference Fusion
It often happens that different references (i.e. data descriptions), possibly coming from different heterogeneous data sources, concern the same real world entity. In such cases, it is necessary: (i) to detect, through reconciliation methods, whether different data descriptions refer to the same real world entity and (ii) to fuse them into a unique representation. Here we assume the reference r...
Exploiting the UMLS metathesaurus in the ontology alignment evaluation initiative
In this paper we describe how the UMLS Metathesaurus—the most comprehensive effort for integrating medical thesauri and ontologies—is being used within the context of the Ontology Alignment Evaluation Initiative (OAEI). We also present the obtained results in the Large BioMed track of the OAEI 2011.5 campaign where the reference alignments are based on UMLS. Finally, we propose a new reference ...